Quantifying the informativeness for biomedical literature summarization: An itemset mining method


Abstract

Objective: Automatic text summarization tools can help users in the biomedical domain to access information efficiently from a large volume of scientific literature and other sources of text documents. In this paper, we propose a summarization method that combines itemset mining and domain knowledge to construct a concept-based model and to extract the main subtopics from an input document. Our summarizer quantifies the informativeness of each sentence using the support values of itemsets appearing in the sentence.

Methods: To address the concept-level analysis of text, our method initially maps the original document to biomedical concepts using the UMLS. Then, it discovers the essential subtopics of the text using a data mining technique, namely itemset mining, and constructs the summarization model. The employed itemset mining algorithm extracts a set of frequent itemsets containing correlated and recurrent concepts of the input document. The summarizer selects the most related and informative sentences and generates the final summary.

Results: We evaluate the performance of our itemset-based summarizer using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) metrics, performing a set of experiments. The results show that the itemset-based summarizer performs better than the compared methods. The itemset-based summarizer achieves the best scores for all the assessed ROUGE metrics.

Conclusion: Compared to the statistical, similarity, and word frequency methods, the proposed method demonstrates that the summarization model obtained from the concept extraction and itemset mining provides the summarizer with an effective metric for measuring the informative content of sentences. This can lead to an improvement in the performance of biomedical literature summarization.
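
The pipeline described in the abstract (concepts per sentence, frequent itemsets, support-based sentence scores, sentence selection) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the concept extraction step is skipped and each sentence is given directly as a set of placeholder concept labels (not real UMLS identifiers); itemsets are mined by brute-force enumeration rather than by whatever algorithm the authors employed; a sentence's score is taken to be the plain sum of the supports of the frequent itemsets it contains; and the names `frequent_itemsets`, `score_sentences`, `summarize`, `min_support`, `max_size`, and `k` are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support=0.5, max_size=3):
    """Brute-force frequent itemset mining over per-sentence concept sets.

    Returns {frozenset(itemset): support}, where support is the fraction of
    sentences that contain every concept in the itemset. Illustrative only;
    the source does not say which mining algorithm is used.
    """
    n = len(transactions)
    counts = {}
    for concepts in transactions:
        for size in range(1, max_size + 1):
            for combo in combinations(sorted(concepts), size):
                key = frozenset(combo)
                counts[key] = counts.get(key, 0) + 1
    return {s: c / n for s, c in counts.items() if c / n >= min_support}


def score_sentences(transactions, itemsets):
    """Assumed scoring rule: sum the supports of the itemsets a sentence contains."""
    return [
        sum(sup for items, sup in itemsets.items() if items <= concepts)
        for concepts in transactions
    ]


def summarize(transactions, k=2, min_support=0.5, max_size=3):
    """Return the indices of the k highest-scoring sentences, in document order."""
    itemsets = frequent_itemsets(transactions, min_support, max_size)
    scores = score_sentences(transactions, itemsets)
    # Rank by descending score; ties go to the earlier sentence.
    ranked = sorted(range(len(transactions)), key=lambda i: (-scores[i], i))
    return sorted(ranked[:k])


# Placeholder concept labels standing in for the output of a concept mapper.
sentences = [
    {"concept_a", "concept_b"},
    {"concept_a", "concept_b", "concept_c"},
    {"concept_c"},
    {"concept_a", "concept_b", "concept_d"},
]
selected = summarize(sentences, k=2)
```

In this toy input, the pair `concept_a`, `concept_b` recurs in three of the four sentences, so sentences containing it accumulate support from both the single concepts and the pair. A real system would also need the concept mapping step and some handling of redundancy among the selected sentences, neither of which is sketched here.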
